Skip to content

Conversation

@donoghuc
Copy link
Member

Release notes

[rn:skip]

What does this PR do?

As a follow up to #18471 use gradle as the entrypoint in CI for running acceptance tests. Note that this still uses rake, but it explicitly uses the bundled jruby to invoke it. I did play around a bit with trying to not shell out and instead use jruby through gradle, but I ran in to an issue I have seen before with file descriptors when the ruby code goes to itself shell out (and capture output). Similarly, i looked at using rspec directly and removing rake, but the current rake file does some requires that the rspec code needs. I think leaving this as a rake task is fine and this is a step forward that serves 1. ensuring gradle is "the" interface for orchestration tasks and 2. We use gradle to manage a ruby environment instead of assuming one is on system using the task orchestrator.

@github-actions
Copy link
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)
  • /run exhaustive tests : Run the exhaustive tests Buildkite pipeline.

@donoghuc
Copy link
Member Author

run exhaustive tests

@mergify
Copy link
Contributor

mergify bot commented Dec 30, 2025

This pull request does not have a backport label. Could you fix it @donoghuc? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • If no backport is necessary, please add the backport-skip label

@donoghuc
Copy link
Member Author

run exhaustive tests

Copy link
Contributor

@andsel andsel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment if we could avoid to compose command lines and use methods to run JRuby, plus a note on the usage of a deprecated API.

build.gradle Outdated
exec {
workingDir "${projectDir}/qa"
environment "BUNDLE_PATH", "${projectDir}/qa/vendor/bundle"
commandLine "${projectDir}/vendor/jruby/bin/jruby", "-S", "bundle", "install"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's part of the problem that you try to avoid, but IIRC to run Ruby scripts inside Gradle we have a method named executeJruby in rubyUtils.grade

Object executeJruby(File projectDir, File buildDir, Closure<?> /* Object*/ block) {
. Do you think worthwhile to use something similar, instead of composing every time a command line?

Project.exec is deprecated since Gradle 8.11, I would avoid to use deprecated API usage, since that would add tech debt:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, thanks for the pointer on the deprecated exec :)

Regarding why i'm shelling out here, I did a bit more of a deep dive in to WHY (and the reason may shock you)....

Backstory: I originally noticed this issue #18471 (comment) when building on test runners (notice how fpm wants to use file descriptors there to capture IO). At that time I just realized there was obviously a strageness there SO i decided i would just shell out early instead of trying to fight it.

With your comment here I decided it was time to do some investigation. Here is what i came up with:

As a minimal repro:

# build.gradle 
tasks.register("testFdInheritance") {
    dependsOn bootstrap
    doLast {
        rake(projectDir, buildDir, 'test:fd')
    }
}
# fd_test.rake
task 'test:fd' do
  r, w = IO.pipe
  puts "Pipe FDs: r=#{r.fileno}, w=#{w.fileno}"
  pid = Process.spawn("true", :out => w)
  Process.wait(pid)
end
bash-4.4$ ./gradlew testFdInheritance
To honour the JVM settings for this build a single-use Daemon process will be forked. For more on this, please refer to https://docs.gradle.org/8.11.1/userguide/gradle_daemon.html#sec:disabling_the_daemon in the Gradle documentation.
Daemon will be stopped at the end of the build

> Task :downloadJRuby UP-TO-DATE
Download https://repo1.maven.org/maven2/org/jruby/jruby-dist/9.4.13.0/jruby-dist-9.4.13.0-bin.tar.gz

> Task :testFdInheritance FAILED
Pipe FDs: r=100003, w=100004
Rake task error: Errno::EBADF: Bad file descriptor - true
Backtrace: org/jruby/RubyProcess.java:1779:in `spawn'
/opt/buildkite-agent/logstash/rakelib/fd_test.rake:4:in `block in <main>'
/opt/buildkite-agent/logstash/vendor/bundle/jruby/3.1.0/gems/rake-13.3.1/lib/rake/task.rb:281:in `block in execute'
org/jruby/RubyArray.java:2009:in `each'
/opt/buildkite-agent/logstash/vendor/bundle/jruby/3.1.0/gems/rake-13.3.1/lib/rake/task.rb:281:in `execute'
/opt/buildkite-agent/logstash/vendor/bundle/jruby/3.1.0/gems/rake-13.3.1/lib/rake/task.rb:219:in `block in invoke_with_call_chain'
org/jruby/ext/monitor/Monitor.java:82:in `synchronize'
/opt/buildkite-agent/logstash/vendor/bundle/jruby/3.1.0/gems/rake-13.3.1/lib/rake/task.rb:199:in `invoke_with_call_chain'
/opt/buildkite-agent/logstash/vendor/bundle/jruby/3.1.0/gems/rake-13.3.1/lib/rake/task.rb:188:in `invoke'
<script>:6:in `<main>'

FAILURE: Build failed with an exception.

* Where:
Script '/opt/buildkite-agent/logstash/rubyUtils.gradle' line: 159

* What went wrong:
Execution failed for task ':testFdInheritance'.
> (null) Bad file descriptor - true

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.

BUILD FAILED in 33s
39 actionable tasks: 4 executed, 35 up-to-date

As you can see somehow we get these crazy "high" file descriptor values r=100003, w=100004 which Process.spawn freaks out on. Something about how gradle/jruby interact there the file descriptors have some sort of virtual or proxy value that dont work when ruby wants to use them.

You can see that forking early "fixes" it (r=19, w=20)

# build.gradle
tasks.register("testFdStandalone", Exec) {
    dependsOn bootstrap
    workingDir projectDir
    environment "GEM_HOME", "${projectDir}/vendor/bundle/jruby/3.1.0"
    environment "GEM_PATH", "${projectDir}/vendor/bundle/jruby/3.1.0"
    commandLine "${projectDir}/vendor/jruby/bin/jruby", "-S", "rake", "test:fd"
}
bash-4.4$ ./gradlew testFdStandalone
To honour the JVM settings for this build a single-use Daemon process will be forked. For more on this, please refer to https://docs.gradle.org/8.11.1/userguide/gradle_daemon.html#sec:disabling_the_daemon in the Gradle documentation.
Daemon will be stopped at the end of the build

> Task :downloadJRuby UP-TO-DATE
Download https://repo1.maven.org/maven2/org/jruby/jruby-dist/9.4.13.0/jruby-dist-9.4.13.0-bin.tar.gz

> Task :testFdStandalone
Pipe FDs: r=19, w=20

BUILD SUCCESSFUL in 43s
39 actionable tasks: 4 executed, 35 up-to-date

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this jruby/jruby#8555

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it looks like there is a workaround other than shelling out early...

From jruby we see how the "fake" file descriptors are managed: https://github.com/jruby/jruby/blob/bc973881b293b1fade48399c4b9b62e21d962351/core/src/main/java/org/jruby/util/io/FilenoUtil.java#L231

We miss an important warning message when that cant be found:

https://github.com/jruby/jruby/blob/bc973881b293b1fade48399c4b9b62e21d962351/core/src/main/java/org/jruby/util/io/FilenoUtil.java#L272-L282

namely this part:

                    LOG.warn("Native IO integration requires open access to the JDK IO subsystem\n" +
                            "Pass '--add-opens java.base/sun.nio.ch=" + moduleName + " --add-opens java.base/java.io=" + moduleName + "' to enable.");

I verified that actually fixes the problem:

bash-4.4$ ./gradlew testFdInheritance -Dorg.gradle.jvmargs="--add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to invoke Gradle and do not have the problem do we need to open the access to modules java.base/sun.nio.ch. and java.base/java.io ?

I would avoid to pass those string to CLI, we need to update a lot of scripts that uses Gradle. Did you tried adding those statements to org.gradle.jvmargs contained in gradle.properties? It would require a good explanation and link to the JRuby issue, for future us to remove or understand why :-(

However, nice work in digging into.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my side, if it's simpler to shell out (with a good explanation) do that.

@donoghuc
Copy link
Member Author

donoghuc commented Jan 2, 2026

run exhaustive tests

@donoghuc
Copy link
Member Author

run exhaustive tests

@donoghuc donoghuc closed this Jan 12, 2026
@donoghuc donoghuc reopened this Jan 12, 2026
@donoghuc
Copy link
Member Author

run exhaustive tests

@donoghuc donoghuc force-pushed the refactor-acceptance-test-hook branch from 175b7e8 to b206e98 Compare January 20, 2026 21:56
@donoghuc donoghuc added the backport-active-all Automated backport with mergify to all the active branches label Jan 21, 2026
Copy link
Contributor

@mashhurs mashhurs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, pinging if @andsel has any concerns.

@andsel andsel self-requested a review January 22, 2026 08:47
Copy link
Contributor

@andsel andsel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple nitpicks plus a question about a deprecation warning, if you could clarify it.

build.gradle Outdated

tasks.register("installAcceptanceTestGems", Exec) {
dependsOn bootstrap
inputs.files file("${projectDir}/qa/Gemfile")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
inputs.files file("${projectDir}/qa/Gemfile")
inputs.file("${projectDir}/qa/Gemfile")

Given qa/Gemfile is a single file, we could use straight the file method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Updated.

build.gradle Outdated
inputs.files file("${projectDir}/qa/Gemfile")
outputs.file("${projectDir}/qa/Gemfile.lock")
workingDir "${projectDir}/qa"
commandLine "${projectDir}/vendor/jruby/bin/jruby", "-S", "bundle", "install", "--path", "vendor/bundle"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Executing

>rm qa/Gemfile.lock
>./gradlew clean installAcceptanceTestGems

(Gradle's clean task doesn't remove the Gemfile.lock, and maybe that's expected).

I saw the following deprecation notice from bundler:

> Task :installAcceptanceTestGems
[DEPRECATED] The `--path` flag is deprecated because it relies on being remembered across bundler invocations, which bundler will no longer do in future versions. Instead please use `bundle config set path 'vendor/bundle'`, and stop using this flag

There is any plan to avoid this deprecation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I did not notice that deprecation. We can just use the env var I think (I believe i used --path at one point because i thought it was providing isolation I needed).

I'm not convinced gradle clean should remove the qa/Gemfile.lock. Though it looks like maybe that has been a problem before

# The acceptance test in our CI infrastructure doesn't clear the workspace between run
# this mean the lock of the Gemfile can be sticky from a previous run, before generating any package
# we will clear them out to make sure we use the latest version of theses files
# If we don't do this we will run into gem Conflict error.
[ -f Gemfile ] && rm Gemfile
[ -f Gemfile.lock ] && rm Gemfile.lock

For now I'm tempted to just leave it in and let the test setup scripts do any cleanup they need.

Copy link
Contributor

@andsel andsel Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I partially agree with you, leave this qa/Gemfile.lock clean up outside of this PR, it's out of scope, but I think we have to discuss.

My point is that Gemfile.lock is not versioned, and ./gradlew clean should move the project status back to "as just freshly checkout-ed", that's why I think the clean should remove this lock file.

The sentence in docker_acceptance_test.sh

infrastructure doesn't clear the workspace between run  this mean the lock of the Gemfile can be sticky from a previous run

means that it tries to fix a lack of reproducible builds. But let's discuss this with team or in a follow up issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point. I'll make a note to come back to this.

build.gradle Outdated
dependsOn installAcceptanceTestGems
workingDir "${projectDir}/qa"
// NOTE: We sub-process here to avoid issues in java 21+ with Jruby whereby access to some modules that
// are required for file descriptors to be properly inherieted when ruby does sub-processing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// are required for file descriptors to be properly inherieted when ruby does sub-processing.
// are required for file descriptors to be properly inherited when ruby does sub-processing.

@donoghuc donoghuc force-pushed the refactor-acceptance-test-hook branch from 4748ec8 to f166c5d Compare January 22, 2026 18:45
@donoghuc
Copy link
Member Author

run exhaustive tests

@donoghuc
Copy link
Member Author

run exhaustive tests

@elasticmachine
Copy link

💛 Build succeeded, but was flaky

Failed CI Steps

History

@donoghuc donoghuc requested a review from andsel January 22, 2026 23:17
@donoghuc
Copy link
Member Author

Thanks for the review @andsel I pushed an update and verified exhaustive tests are working still!

Copy link
Contributor

@andsel andsel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@donoghuc donoghuc merged commit 7e9c384 into elastic:main Jan 23, 2026
13 checks passed
@github-actions
Copy link
Contributor

@Mergifyio backport 8.19 9.2 9.3

@mergify
Copy link
Contributor

mergify bot commented Jan 23, 2026

backport 8.19 9.2 9.3

✅ Backports have been created

Details

Cherry-pick of 7e9c384 has failed:

On branch mergify/bp/8.19/pr-18536
Your branch is up to date with 'origin/8.19'.

You are currently cherry-picking commit 7e9c38455.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   build.gradle
	modified:   ci/acceptance_tests.sh

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   qa/rspec/commands/base.rb

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

mergify bot pushed a commit that referenced this pull request Jan 23, 2026
* WIP: just test on remote linux box.

* fixup compilation bugs

* actually use it...

* expose rubyUtils methods

* actual gem path

* try to make gems available

* add jruby stdlib gems to gem_path

* try different idea

* try shelling out instead

* use rake still

* revert rubyUtils.gradle changes

* explicitly set BUNDLE_PATH

* dont use deprecated exec

* stop using env var for bundle_path

* unset any bundle related env var before invoking LS

* unset not available on test runners

* document why we shell out

* codereview changes

* make sure BUNDLE_PATH is set after bundle install

(cherry picked from commit 7e9c384)

# Conflicts:
#	qa/rspec/commands/base.rb
mergify bot pushed a commit that referenced this pull request Jan 23, 2026
* WIP: just test on remote linux box.

* fixup compilation bugs

* actually use it...

* expose rubyUtils methods

* actual gem path

* try to make gems available

* add jruby stdlib gems to gem_path

* try different idea

* try shelling out instead

* use rake still

* revert rubyUtils.gradle changes

* explicitly set BUNDLE_PATH

* dont use deprecated exec

* stop using env var for bundle_path

* unset any bundle related env var before invoking LS

* unset not available on test runners

* document why we shell out

* codereview changes

* make sure BUNDLE_PATH is set after bundle install

(cherry picked from commit 7e9c384)
mergify bot pushed a commit that referenced this pull request Jan 23, 2026
* WIP: just test on remote linux box.

* fixup compilation bugs

* actually use it...

* expose rubyUtils methods

* actual gem path

* try to make gems available

* add jruby stdlib gems to gem_path

* try different idea

* try shelling out instead

* use rake still

* revert rubyUtils.gradle changes

* explicitly set BUNDLE_PATH

* dont use deprecated exec

* stop using env var for bundle_path

* unset any bundle related env var before invoking LS

* unset not available on test runners

* document why we shell out

* codereview changes

* make sure BUNDLE_PATH is set after bundle install

(cherry picked from commit 7e9c384)
@donoghuc
Copy link
Member Author

@Mergifyio backport 9.1

@mergify
Copy link
Contributor

mergify bot commented Jan 23, 2026

backport 9.1

✅ Backports have been created

Details

mergify bot pushed a commit that referenced this pull request Jan 23, 2026
* WIP: just test on remote linux box.

* fixup compilation bugs

* actually use it...

* expose rubyUtils methods

* actual gem path

* try to make gems available

* add jruby stdlib gems to gem_path

* try different idea

* try shelling out instead

* use rake still

* revert rubyUtils.gradle changes

* explicitly set BUNDLE_PATH

* dont use deprecated exec

* stop using env var for bundle_path

* unset any bundle related env var before invoking LS

* unset not available on test runners

* document why we shell out

* codereview changes

* make sure BUNDLE_PATH is set after bundle install

(cherry picked from commit 7e9c384)
donoghuc added a commit that referenced this pull request Jan 23, 2026
…) (#18633)

* WIP: just test on remote linux box.

* fixup compilation bugs

* actually use it...

* expose rubyUtils methods

* actual gem path

* try to make gems available

* add jruby stdlib gems to gem_path

* try different idea

* try shelling out instead

* use rake still

* revert rubyUtils.gradle changes

* explicitly set BUNDLE_PATH

* dont use deprecated exec

* stop using env var for bundle_path

* unset any bundle related env var before invoking LS

* unset not available on test runners

* document why we shell out

* codereview changes

* make sure BUNDLE_PATH is set after bundle install

(cherry picked from commit 7e9c384)

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>
donoghuc added a commit that referenced this pull request Jan 23, 2026
…) (#18634)

* WIP: just test on remote linux box.

* fixup compilation bugs

* actually use it...

* expose rubyUtils methods

* actual gem path

* try to make gems available

* add jruby stdlib gems to gem_path

* try different idea

* try shelling out instead

* use rake still

* revert rubyUtils.gradle changes

* explicitly set BUNDLE_PATH

* dont use deprecated exec

* stop using env var for bundle_path

* unset any bundle related env var before invoking LS

* unset not available on test runners

* document why we shell out

* codereview changes

* make sure BUNDLE_PATH is set after bundle install

(cherry picked from commit 7e9c384)

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>
donoghuc added a commit that referenced this pull request Jan 23, 2026
…) (#18639)

* WIP: just test on remote linux box.

* fixup compilation bugs

* actually use it...

* expose rubyUtils methods

* actual gem path

* try to make gems available

* add jruby stdlib gems to gem_path

* try different idea

* try shelling out instead

* use rake still

* revert rubyUtils.gradle changes

* explicitly set BUNDLE_PATH

* dont use deprecated exec

* stop using env var for bundle_path

* unset any bundle related env var before invoking LS

* unset not available on test runners

* document why we shell out

* codereview changes

* make sure BUNDLE_PATH is set after bundle install

(cherry picked from commit 7e9c384)

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>
donoghuc added a commit that referenced this pull request Jan 23, 2026
…tests orchestration (#18632)

* Use gradle and bundled jruby for acceptance tests orchestration (#18536)

* WIP: just test on remote linux box.

* fixup compilation bugs

* actually use it...

* expose rubyUtils methods

* actual gem path

* try to make gems available

* add jruby stdlib gems to gem_path

* try different idea

* try shelling out instead

* use rake still

* revert rubyUtils.gradle changes

* explicitly set BUNDLE_PATH

* dont use deprecated exec

* stop using env var for bundle_path

* unset any bundle related env var before invoking LS

* unset not available on test runners

* document why we shell out

* codereview changes

* make sure BUNDLE_PATH is set after bundle install

(cherry picked from commit 7e9c384)

# Conflicts:
#	qa/rspec/commands/base.rb

* fix merge conflict

---------

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-active-all Automated backport with mergify to all the active branches

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants